In recent work we have shown how it is possible to define very precise type systems for
object-oriented languages by abstractly compiling a program into a Horn formula. Then type inference
amounts to resolving a certain goal w.r.t. the coinductive (that is, the greatest) Herbrand model of
that formula.

Type systems defined in this way are idealized, since in the most interesting instantiations both
the terms of the coinductive Herbrand universe and goal derivations cannot be finitely represented.
However, sound and quite expressive approximations can be implemented by considering only regular
terms and derivations. In doing so, it is essential to introduce a proper subtyping relation
formalizing the notion of approximation between types.

In this paper we study a subtyping relation on coinductive terms built on union and object type
constructors. We define an interpretation of types as sets of values induced by a quite intuitive
relation of membership of values to types, and prove that the definition of subtyping is sound w.r.t.
subset inclusion between type interpretations. The proof of soundness has allowed us to simplify the
notion of contractive derivation and to discover that the previously given definition of subtyping
did not cover all possible representations of the empty type.

1 Introduction 

In recent work we have defined a framework which allows precise type analysis of object-oriented
programs by means of abstract compilation of the program to be analyzed into a Horn formula (that is,
a conjunction of Horn clauses). Then, type inference corresponds to resolving a certain goal (or query)
w.r.t. the coinductive (that is, the greatest) Herbrand model of that formula.

Coinductively defined terms of the Herbrand universe (which correspond to type expressions), in
conjunction with the union type constructor, provide an abstract representation for arbitrary sets of
values, whereas coinductive SLD resolution [15, 14] allows type inference of recursive method
invocation. However, type systems defined in this way are idealized, since, except for the simplest
cases where types are just constants, in the most interesting instantiations both terms and goal
derivations cannot be finitely represented.

Nevertheless, sound and quite expressive approximations can be implemented by considering only
regular types and derivations, that is, infinite terms and trees, respectively, which can be finitely
represented. In doing so, it is essential to introduce a proper subtyping relation [2] formalizing the
notion of approximation between types, and a corresponding notion of subsumption at the level of goal
derivation. In this way, regular types, which correspond to usual recursive types, are simply
considered as approximations (that is, supertypes) of much finer infinite types which have no finite
representation.

This novel approach has several advantages: 
• It offers a quite general and highly modular framework for type analysis of object-oriented
programs, where quite different kinds of analysis can be defined without changing the core inference
engine based on coinductive SLD resolution empowered by the notions of subtyping and subsumption.
Every instantiation corresponds to a particular choice of the type constructors, the abstract
compilation schema, and the definition of the subtyping relation. Our previous papers provide
several examples corresponding to different instantiations of the same framework [2, 3]; from
this point of view, our proposal is an attempt to provide a common framework for reasoning on
type analysis of object-oriented programs. Indeed, the solutions to the problem of type analysis of
object-oriented programs which can be found in the literature [13, 12, 1, 11, 16, 10] are often rather
ad hoc, cannot be easily described in an abstract way, and, for these reasons, cannot be easily
compared.

• Several static analysis techniques for compiler optimization can be easily adopted for enhancing
type analysis. For instance, we have shown [5] that a more precise type analysis can be obtained
when abstract compilation is performed on programs in Static Single Assignment intermediate
form.

• It promotes a nice integration between theory and practice, since type inference algorithms are just
approximations of an idealized type system: the type judgments derivable in the idealized system can be
expressed as the limits of chains of approximating judgments derivable by the algorithm, whose
precision depends on the space and time resources available to the implementation.

The definition of a suitable subtyping relation is of paramount importance to obtain reasonable
approximations of our framework, especially in the presence of union types, which have proved to be
quite expressive when coinductive terms are considered.

For this reason, in this paper we study a subtyping relation on coinductive terms built on union and
object type constructors. Since types may be infinite, the relation is defined coinductively; however,
such a definition is far from being intuitive, because a suitable notion of contractive [6, 7]
derivation has to be introduced to avoid unsound derivations. The contributions of this paper w.r.t.
our previous work are the following:

• We define an interpretation of types as sets of values induced by a quite intuitive relation of
membership of values to types.

• We prove that the definition of subtyping is sound w.r.t. subset inclusion between type
interpretations. The proof of soundness has allowed us to simplify the notion of contractive
derivation for subtyping.

• We have discovered that the previously given definition of subtyping did not cover all possible 
representations of types with an empty interpretation. Consequently, a new subtyping rule has 
been added, based on a complete characterization of empty types; such a characterization allowed 
us to define an algorithm for checking empty regular types. 

In Section 2 a gentle introduction to the framework is given by means of simple examples. Subtyping
and type interpretation are defined in Section 3, whereas Section 4 is devoted to the proof of
soundness. Section 5 deals with empty types, and, finally, Section 6 draws some conclusions.

2 Abstract compilation into Horn formulas 

Let us consider the standard encoding of natural numbers with objects, written in Java-like code where, 
however, all type annotations have been omitted. 
class Zero {
    add(n) { return n; }
}

class Succ {
    pred;
    Succ(n) { this.pred = n; }
    add(n) { return pred.add(new Succ(n)); }
}

For simplicity, we just consider method add; class Succ represents all natural numbers greater than zero, 
that is, all numbers which are successors of a given natural number, stored in the field pred. 
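
To make the encoding concrete, the two classes can be transliterated into a small runnable sketch. The choice of Python is an assumption of this sketch (the code above is untyped Java-like code), and the helper to_int is hypothetical: it is not part of the encoding and only serves to inspect results.

```python
class Zero:
    def add(self, n):
        # 0 + n = n
        return n


class Succ:
    def __init__(self, n):
        self.pred = n  # the predecessor of this number

    def add(self, n):
        # (p + 1) + n = p + (n + 1)
        return self.pred.add(Succ(n))


def to_int(n):
    # Hypothetical helper: count the Succ layers wrapped around Zero.
    count = 0
    while isinstance(n, Succ):
        count += 1
        n = n.pred
    return count


two = Succ(Succ(Zero()))
three = Succ(two)
print(to_int(two.add(three)))  # prints 5
```

Each call to add on a Succ object peels one layer off the target and adds one layer to the argument, which is the pattern that the derivation for invoke discussed later in this section follows.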

In the abstract compilation approach a program, such as the one shown above, is translated into a Horn
formula where predicates encode the constructs of the language. For instance, the predicate invoke
corresponds to method invocation, and has four arguments: the target object, the method name, the
argument list, and the returned result. Terms represent either types (that is, sets of values) or names
(of classes, methods and fields). In the instantiation we consider here, types include object types
$obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])$, where $c$ is the class of the object and $f_1,\ldots,f_n$ its
fields with their corresponding types $t_1,\ldots,t_n$, union types $t_1 \vee t_2$, and primitive types
such as int. In the idealized abstract compilation framework, terms can also be infinite and non
regular^1; a regular term is a term which can be infinite, but can only contain a finite number of
subterms or, equivalently, can be represented as the solution of a unification problem, that is, a
finite set of syntactic equations of the form $X_i = t_i$, where all variables $X_i$ are distinct and
terms $t_i$ may only contain variables $X_j$ [8, 14]. For instance, the term $t$ s.t. $t = int \vee t$
is regular^2 since it has only two subterms, namely, int and itself.

Let us see some examples of regular types, that is, regular terms representing sets of values.

\[
\begin{aligned}
zer &= obj(zero,[\,]) \\
nat &= zer \vee obj(succ,[pred{:}nat]) \\
pos &= obj(succ,[pred{:}zer]) \vee obj(succ,[pred{:}pos]) \\
evn &= zer \vee obj(succ,[pred{:}obj(succ,[pred{:}evn])]) \\
odd &= obj(succ,[pred{:}zer]) \vee obj(succ,[pred{:}obj(succ,[pred{:}odd])])
\end{aligned}
\]

Type zer corresponds to all objects representing zero, while nat corresponds to all objects representing 
natural numbers and, similarly, pos, evn and odd to all objects representing positive, even, and odd natural 
numbers, respectively. An example of a non regular type is given by the infinite sequence
$t_1 \vee (t_2 \vee (\ldots \vee (t_n \vee \ldots)))$, where the term $t_i$ represents the $i$-th prime
number.
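
The finite-equation view of regular types can be sketched directly as a data structure. The following is a minimal illustration, not the representation used by the authors: the dictionary layout, the auxiliary names (succ_nat, ss_evn, s_evn) and the function subterms are assumptions introduced here. Each entry is one syntactic equation, and the set of names reachable from a root plays the role of its (finitely many) subterms.

```python
# One entry per equation "name = constructor".
# Constructors: ("int",), ("or", left, right), ("obj", cls, {field: name}).
TYPES = {
    "zer": ("obj", "zero", {}),
    "nat": ("or", "zer", "succ_nat"),
    "succ_nat": ("obj", "succ", {"pred": "nat"}),
    "evn": ("or", "zer", "ss_evn"),
    "ss_evn": ("obj", "succ", {"pred": "s_evn"}),
    "s_evn": ("obj", "succ", {"pred": "evn"}),
}


def subterms(env, root):
    """Return the names reachable from root, i.e. the subterms of root."""
    seen, todo = set(), [root]
    while todo:
        name = todo.pop()
        if name in seen:
            continue  # cyclic reference: already visited
        seen.add(name)
        node = env[name]
        if node[0] == "or":
            todo.extend(node[1:])
        elif node[0] == "obj":
            todo.extend(node[2].values())
    return seen


print(sorted(subterms(TYPES, "nat")))  # ['nat', 'succ_nat', 'zer']
```

A non regular type such as the union of all prime numbers would need infinitely many entries, so it has no representation of this kind.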

Each method declaration is compiled into a single clause, defining a different case for the predicate
has_meth, which takes four arguments: the class where the method is declared, its name, the types of
its arguments, including the special argument this corresponding to the target object, and the type of
the returned value. Predicate has_meth defines the usual method look-up:
$has\_meth(c, m, [this, t_1, \ldots, t_n], t)$ succeeds if look-up of $m$ from class $c$ succeeds and
returns a method that, when invoked on target object and arguments $this, t_1, \ldots, t_n$, returns
values of type $t$.

For instance, the method declarations of the two classes defined above are compiled as follows: 

has_meth(zero, add, [This, N], N).

has_meth(succ, add, [This, N], R) ←
    field_acc(This, pred, P),
    new(succ, [N], S),
    invoke(P, add, [S], R).

^1 We refer to the authors' previous work [4, 2, 3] for more details.
^2 The exact meaning of such a term will be explained in the next section.

Predicates field_acc, new and invoke correspond to field access, constructor invocation and method
invocation, respectively. Similarly to what happens for methods, each constructor declaration is also
compiled into a clause. For instance, the following clause is generated from the constructor of class
Succ:

new(succ, [N], obj(succ, [pred:N|R])) ← extends(succ, P), new(P, [], obj(P, R)).

In this case, since we know^3 that extends(succ, object) and new(object, [], obj(object, [])) hold, we
can derive new(succ, [N], obj(succ, [pred:N])).

Other generated clauses are common to all programs and depend on the semantics of the language or 
on the meaning of types. 

invoke(T1∨T2, M, A, R1∨R2) ← invoke(T1, M, A, R1), invoke(T2, M, A, R2).
invoke(obj(C, R), M, A, Res) ← has_meth(C, M, [obj(C, R)|A], Res).

The first clause specifies the behavior of invoke with union types. The invocation must be correct for
both target types T1 and T2, and the returned type is the union of the returned types R1 and R2. When
the target is an object type obj(C, R), then invocation of M with arguments A is correct if look-up of M
with first argument obj(C, R), corresponding to this, and rest of arguments A succeeds when starting
from class C.

We now show that the goal invoke(evn, add, [odd], R) is derivable for $R = t$, where $t$ is the regular
type s.t. $t = odd \vee t$. If we take for granted that $t$ is equivalent^4 to odd, then not only can we
prove that adding an even and an odd number always returns an odd number, but we can also infer the
thesis (that is, the result is an odd number), since the query corresponds to just asking which number
is returned when adding an even and an odd number.

We recall that, when considering the coinductive Herbrand model, derivations are allowed to be
infinite [15]. Then, since $evn = zer \vee obj(succ,[pred{:}obj(succ,[pred{:}evn])])$, by clause 1 for
invoke we must show that invoke(zer, add, [odd], odd) and
invoke(obj(succ, [pred:obj(succ, [pred:evn])]), add, [odd], t) hold.
The first atom can be derived by applying clause 2 for invoke, and then the clause for has_meth
generated from class Zero. For the second atom we apply clause 2 for invoke, and then the clause for
has_meth generated from class Succ, and get invoke(obj(succ, [pred:evn]), add, [obj(succ, [pred:odd])], t).
Then, if we re-apply the same clauses once again, we get $invoke(evn, add, [succ^2(odd)], t)$ (where
$succ^2(odd)$ is just an abbreviation for $obj(succ,[pred{:}obj(succ,[pred{:}odd])])$), which is equal
to the initial goal, except for the argument type, which is $succ^2(odd)$ instead of odd. It is now
clear that we can get an infinite derivation containing all atoms having shape
$invoke(evn, add, [succ^{2n}(odd)], t)$ for all $n \geq 0$, hence invoke(evn, add, [odd], t) is derivable.

There are two main problems with the example of derivation given above: it is not regular, hence it
cannot be computed, and we would like to resolve invoke(evn, add, [odd], R) for $R = odd$ rather than for
$R = t$. To overcome these problems, a subtyping relation has to be introduced together with a notion of
subsumption between atoms. The definition of the subtyping relation is postponed to the next section;
however, intuition suggests that $succ^2(odd) \leq odd$ and $t \leq odd$ should hold^5. Furthermore, the
following subsumption relations are expected to hold: if $succ^2(odd) \leq odd$, then
invoke(evn, add, [odd], t)
^3 The set of all clauses generated from the two class declarations is available in the Appendix.
^4 The equivalence between the two terms will be clarified in the next section.
^5 More precisely, both directions of the two inequalities hold, since both pairs of terms are
equivalent, but here we are only interested in one specific direction.
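\[
(int)\ \dfrac{}{int \leq int}
\qquad
(\vee R1)\ \dfrac{t \leq t_1}{t \leq t_1 \vee t_2}
\qquad
(\vee R2)\ \dfrac{t \leq t_2}{t \leq t_1 \vee t_2}
\qquad
(\vee L)\ \dfrac{t_1 \leq t \quad t_2 \leq t}{t_1 \vee t_2 \leq t}
\]
\[
(obj)\ \dfrac{t_1 \leq t'_1 \quad \ldots \quad t_n \leq t'_n}
{obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n,\ldots]) \leq obj(c,[f_1{:}t'_1,\ldots,f_n{:}t'_n])}
\]
\[
(distr)\ \dfrac{obj(c,[f{:}u_1,f_1{:}t_1,\ldots,f_n{:}t_n]) \leq t \qquad
obj(c,[f{:}u_2,f_1{:}t_1,\ldots,f_n{:}t_n]) \leq t}
{obj(c,[f{:}u_1 \vee u_2,f_1{:}t_1,\ldots,f_n{:}t_n]) \leq t}
\]
Figure 1: Rules defining the subtyping relation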



subsumes $invoke(evn, add, [succ^2(odd)], t)$, that is, subtyping is contravariant w.r.t. method
arguments, as usual, and, therefore, if method add returns $t$ when applied to argument odd, then it
returns $t$ when applied to any subtype of odd (in this specific case, $succ^2(odd)$). On the other hand,
subtyping is covariant w.r.t. the returned type, therefore if $t \leq odd$ then invoke(evn, add, [odd], t)
subsumes invoke(evn, add, [odd], odd), that is, if method add returns $t$ when applied to odd, then it
returns all supertypes of $t$ as well (odd in this specific case).

By introducing subtyping and subsumption it is possible to build a regular derivation for
invoke(evn, add, [odd], t), by just observing that to prove invoke(evn, add, [odd], t) we need to prove
$invoke(evn, add, [succ^2(odd)], t)$ which, in turn, is subsumed by invoke(evn, add, [odd], t), hence we
can conclude the proof by coinductive hypothesis. Finally, by applying subsumption once more we can
derive invoke(evn, add, [odd], odd) from invoke(evn, add, [odd], t). In practice, this means that coSLD
resolution [15] can be generalized by taking into account subtyping constraints between terms, besides
the usual unification constraints.



3 Subtyping and type interpretation 

In this section we formally define subtyping as a syntactic relation between types; then we provide 
an intuitive interpretation of types as sets of values, to define a semantic counterpart of the subtyping 
relation. 

3.1 Definition of subtyping 

The types we consider are all infinite terms coinductively defined as follows: 

\[ t ::= int \mid obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n]) \mid t_1 \vee t_2 \]

An object type $obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])$ specifies the class $c$ to which the object belongs,
together with the set of available fields with their corresponding types. The class name is needed for
typing method invocations. We assume that fields in an object type are finite, distinct and that their
order is immaterial. Union types $t_1 \vee t_2$ have the standard meaning.

The subtyping relation is coinductively defined by the rules in Figure 1. Rules are conceived for a
purely functional setting [2]; an extension for dealing with imperative features can be found in
another paper by the same authors.

Rules (∨R1), (∨R2) and (∨L) specify subtyping between union types, and simply state that the union
type constructor is the join operator w.r.t. subtyping. Note also the strong analogy with the left and right



D. Ancona, G. Lagorio

logical rules of the classical Gentzen sequent calculus for the disjunction, when the subtyping
relation is replaced with the provability relation.

Rule (obj) corresponds to standard width and depth subtyping between object types: the type on the
left-hand side may have more fields (represented by the ellipsis at the end), while subtyping is
covariant w.r.t. the fields belonging to both types. Note that depth subtyping is allowed since we are
considering a purely functional setting [3]. Finally, subtyping between object types is allowed only
when they refer to the same class name.

Rule (distr) expresses distributivity of object over union types; intuitively, object types correspond
to Cartesian products, which distribute over union: $A \times (B \cup C) = (A \times B) \cup (A \times C)$.
For instance $obj(c,[f{:}t_1 \vee t_2]) \equiv obj(c,[f{:}t_1]) \vee obj(c,[f{:}t_2])$, where
$u_1 \equiv u_2$ holds iff $u_1 \leq u_2$ and $u_2 \leq u_1$. The relation
$obj(c,[f{:}t_1]) \vee obj(c,[f{:}t_2]) \leq obj(c,[f{:}t_1 \vee t_2])$ can be derived by applying rules
(∨L), (obj), (∨R1) and (∨R2), and by the fact that $t_1 \leq t_1 \vee t_2$ and $t_2 \leq t_1 \vee t_2$
hold by reflexivity, which is ensured by rules (int) and (obj). Rule (distr) is necessary for deriving
the opposite direction of the relation, since by applying rules (∨R1), (∨R2) and (obj) we end up with
$t_1 \vee t_2 \leq t_1$ or $t_1 \vee t_2 \leq t_2$, which in general do not hold. Finally, note that rule
(distr) is applicable only when the object type on the left-hand side has at least one field associated
with a union type; since the order of fields is immaterial, in the rule such a field always appears in
the first position for readability.
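
As a worked illustration of the easy direction, the derivation below spells out how rules (∨L), (obj), (∨R1) and (∨R2) combine. It is a sketch under the assumption that $t_1 \leq t_1$ and $t_2 \leq t_2$ are already derivable (reflexivity); the sub-derivations for these two leaves are left out.

```latex
\[
(\vee L)\ \dfrac{
  (obj)\ \dfrac{(\vee R1)\ \dfrac{t_1 \leq t_1}{t_1 \leq t_1 \vee t_2}}
               {obj(c,[f{:}t_1]) \leq obj(c,[f{:}t_1 \vee t_2])}
  \qquad
  (obj)\ \dfrac{(\vee R2)\ \dfrac{t_2 \leq t_2}{t_2 \leq t_1 \vee t_2}}
               {obj(c,[f{:}t_2]) \leq obj(c,[f{:}t_1 \vee t_2])}
}{obj(c,[f{:}t_1]) \vee obj(c,[f{:}t_2]) \leq obj(c,[f{:}t_1 \vee t_2])}
\]
```

Reading the same tree for the opposite direction shows where it breaks: after (∨R1) or (∨R2) and (obj), the remaining goal would be $t_1 \vee t_2 \leq t_1$ or $t_1 \vee t_2 \leq t_2$.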

A derivation is a tree where each node is a pair consisting of a judgment of the shape $t_1 \leq t_2$,
and the label of a rule^6, and where each node, together with its children, corresponds to a valid
instantiation of a rule. For instance, the following tree

\[
\dfrac{(int \leq int,\ int) \qquad (int \leq int,\ int)}{(int \vee int \leq int,\ \vee L)}
\]

is a derivation for $int \vee int \leq int$. However, in the rest of the paper we will use the following
equivalent but more intuitive representation for derivations:

\[
(\vee L)\ \dfrac{(int)\ \dfrac{}{int \leq int} \qquad (int)\ \dfrac{}{int \leq int}}{int \vee int \leq int}
\]

Since subtyping is defined over infinite types, all rules must be interpreted coinductively, therefore
derivations are allowed to be infinite. However, not all infinite derivations can be considered valid,
but only the contractive ones [6, 7] (see the definition below). To see why we need such a restriction,
consider the regular type $u$ s.t. $u = u \vee u$, and the following infinite derivation containing just
applications of rules (∨R1) and (∨R2):

\[
\dfrac{\dfrac{\vdots}{int \leq u}}{int \leq u}
\]

We reject infinite derivations built applying only rules (∨R1) and (∨R2), since they allow unsound
judgments, such as $int \leq u$ derived above. As will be shown in Section 3.2, $u$ corresponds to the
empty type, that is, to the bottom element $\bot$ w.r.t. the subtyping relation; indeed, for any type
$t$ there exists a contractive derivation for $\bot \leq t$ obtained by applying rule (∨L) infinitely
many times.



^6 This labeling is necessary for the proof of soundness.
\[
(int)\ \dfrac{}{i \in int}
\qquad
(\vee L)\ \dfrac{v \in t_1}{v \in t_1 \vee t_2}
\qquad
(\vee R)\ \dfrac{v \in t_2}{v \in t_1 \vee t_2}
\]
\[
(obj)\ \dfrac{v_1 \in t_1 \quad \ldots \quad v_n \in t_n}
{obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n,\ldots]) \in obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])}
\]
Figure 2: Rules defining membership



Before giving the formal definition of contractive derivation, let us consider another example: if
$\bot$ is again the regular type s.t. $\bot = \bot \vee \bot$, then the following infinite derivation,
obtained by infinite applications of rule (distr), proves that $obj(c,[f_1{:}\bot,f_2{:}t]) \leq u$ for
all $u$:

\[
(distr)\ \dfrac{\dfrac{\vdots}{obj(c,[f_1{:}\bot,f_2{:}t]) \leq u} \qquad
\dfrac{\vdots}{obj(c,[f_1{:}\bot,f_2{:}t]) \leq u}}
{obj(c,[f_1{:}\bot,f_2{:}t]) \leq u}
\]

Apparently this seems to be an unsound use of rule (distr), as happens for rules (∨R1) and (∨R2)
in the example above; however, this is not the case, as we formally prove in the next section. Since
$obj(c,[f_1{:}\bot,f_2{:}t]) \leq u$ and $\bot \leq u$ for all types $u$, then
$\bot \leq obj(c,[f_1{:}\bot,f_2{:}t])$ and $obj(c,[f_1{:}\bot,f_2{:}t]) \leq \bot$ hold, that is, the
two types are equivalent and, therefore, both represent the empty type. This result is not so
surprising if we interpret the empty type as the empty set of values, and we recall the similarity
between records and Cartesian products, and the validity of the equation $\emptyset \times A = \emptyset$.

Def. 3.1 A derivation for $t_1 \leq t_2$ is contractive iff it contains no sub-derivations built only
with rules (∨R1) and (∨R2). The subtyping relation $t_1 \leq t_2$ holds iff there is a contractive
derivation for it.

In the following we use the term derivation for contractive ones, unless explicitly specified. 
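
For regular types, the role of contractiveness can be illustrated with a small search procedure. What follows is a minimal sketch and not the authors' algorithm: it assumes types are given as finite equation systems (the names in ENV are hypothetical), it looks only for regular derivations, it deliberately leaves out rule (distr), and it always tries (∨L) first when the left-hand side is a union. A goal that reappears on the current path is accepted by coinductive hypothesis only if the cycle contains a rule other than (∨R1) and (∨R2).

```python
# Regular types as finite equation systems (names are hypothetical):
# ("int",), ("or", left, right), ("obj", cls, {field: type_name}).
ENV = {
    "int": ("int",),
    "u": ("or", "u", "u"),                      # u = u ∨ u
    "zer": ("obj", "zero", {}),
    "nat": ("or", "zer", "s_nat"),
    "s_nat": ("obj", "succ", {"pred": "nat"}),
    "evn": ("or", "zer", "ss_evn"),
    "ss_evn": ("obj", "succ", {"pred": "s_evn"}),
    "s_evn": ("obj", "succ", {"pred": "evn"}),
}

UNION_RIGHT = {"VR1", "VR2"}


def subtype(env, t, u, trail=()):
    """Search for a regular contractive derivation of t <= u.

    trail holds the (goal, rule) pairs from the root to the current node.
    Rule (distr) is not implemented in this sketch.
    """
    goal = (t, u)
    for i, (seen, _) in enumerate(trail):
        if seen == goal:
            # Close the cycle only if it is not made of (VR1)/(VR2) alone.
            return any(rule not in UNION_RIGHT for _, rule in trail[i:])

    def step(rule, a, b):
        return subtype(env, a, b, trail + ((goal, rule),))

    left, right = env[t], env[u]
    if left[0] == "or":                               # rule (VL)
        return step("VL", left[1], u) and step("VL", left[2], u)
    if right[0] == "or":                              # rules (VR1), (VR2)
        return step("VR1", t, right[1]) or step("VR2", t, right[2])
    if left[0] == "int" or right[0] == "int":         # rule (int)
        return left[0] == right[0]
    # rule (obj): same class, width and depth subtyping
    return (left[1] == right[1]
            and all(f in left[2] and step("obj", left[2][f], ft)
                    for f, ft in right[2].items()))


print(subtype(ENV, "int", "u"))    # False: only (VR1)/(VR2) cycles exist
print(subtype(ENV, "u", "int"))    # True: the cycle goes through (VL)
print(subtype(ENV, "evn", "nat"))  # True
```

Because (distr) is missing, the sketch can answer False for pairs that are in the subtyping relation; it is meant only to show how the contractiveness check separates the unsound derivation for $int \leq u$ from the valid one for $u \leq int$.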
3.2 Interpretation of types 

We interpret types in a quite intuitive way, that is, as sets of values. Values are all infinite terms
coinductively defined by the following syntactic rules (where $i \in \mathbb{Z}$).

\[ v ::= i \mid obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n]) \]

As happens for object types, fields in object values are finite and distinct, and their order is immaterial. 
Regular values correspond to finite, but cyclic, objects. 

Membership of values to (the interpretation of) types is coinductively defined by the rules of Figure 2.
All rules are intuitive. Note that an object value is allowed to belong to an object type having fewer
fields; this is expressed by the ellipsis at the end of the values in the membership rule (obj).

An analogous notion of contractive derivation has to be enforced also for membership derivations. 

Def. 3.2 A derivation for $v \in t$ is contractive iff it contains no sub-derivations built only with
membership rules (∨R) and (∨L). The membership relation $v \in t$ holds iff there is a contractive
derivation for it. The interpretation of type $t$ is denoted by $[\![t]\!]$ and defined by
$\{v \mid v \in t \text{ holds}\}$.

Before proving the main soundness theorem we show some examples of interpretations. 

Example 1 If $\bot$ is the regular type s.t. $\bot = \bot \vee \bot$, then $[\![\bot]\!] = \emptyset$.
Indeed, the only applicable rules are (∨L) and (∨R), hence only non contractive derivations can be built.
Example 2 If $t$ is the regular type s.t. $t = int \vee t$, then $[\![t]\!] = [\![int]\!] = \mathbb{Z}$,
that is, $t$ and int have the same interpretation. Indeed, all the contractive derivations are obtained
by applying $n$ times ($n \geq 0$) rule (∨R) (which is useless in this case), then rule (∨L) followed by
(int):

\[
(\vee R)\ \dfrac{\begin{array}{c}
(\vee L)\ \dfrac{(int)\ \dfrac{}{i \in int}}{i \in int \vee t} \\
\vdots
\end{array}}{i \in int \vee t}
\]

Example 3 Let us consider the infinite (but not regular) type $t_1$ defined by the following infinite set
of equations (where $t_1$ corresponds to $X_0$):

\[
\begin{aligned}
X_0 &= Y_0 \vee X_1 & Y_0 &= obj(zero,[\,]) \\
    &\ \,\vdots      & Y_1 &= obj(succ,[pred{:}Y_0]) \\
X_n &= Y_n \vee X_{n+1} & Y_{n+1} &= obj(succ,[pred{:}Y_n]) \\
    &\ \,\vdots      &     &\ \,\vdots
\end{aligned}
\]

Let $t_2$ be the term s.t. $t_2 = obj(zero,[\,]) \vee obj(succ,[pred{:}t_2])$. Then
$[\![t_1]\!] \subset [\![t_2]\!]$; indeed, it is easy to show that $[\![t_1]\!]$ is the set of all
objects representing natural numbers, and that such values belong to $[\![t_2]\!]$ as well (all
derivations are finite, hence trivially contractive), whereas the value $v_\infty$ s.t.
$v_\infty = obj(succ,[pred \mapsto v_\infty])$ belongs to $t_2$, but not to $t_1$. Indeed, the following
contractive and regular derivation can be built by alternately applying rules (∨R) and (obj)
infinitely many times.

\[
(\vee R)\ \dfrac{(obj)\ \dfrac{\dfrac{\vdots}{v_\infty \in t_2}}{v_\infty \in obj(succ,[pred{:}t_2])}}
{v_\infty \in t_2}
\]

Finally, it is not difficult to prove that the only derivation for $v_\infty \in t_1$ is not
contractive, since it can be obtained by infinitely applying rule (∨R); therefore
$v_\infty \notin [\![t_1]\!]$.
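
The three examples can be replayed on regular types and regular values with a short membership check. This is a minimal sketch under stated assumptions, not a procedure given by the authors: types and values are encoded as finite equation systems whose names (bot, t, t2, s_t2, seven, v_inf) are hypothetical, and only regular derivations are searched. A goal that reappears on the current path is accepted only if the cycle goes through rule (obj), which mirrors Def. 3.2. The non regular type $t_1$ of Example 3 has no finite encoding, so it is not covered.

```python
TYPES = {
    "int": ("int",),
    "bot": ("or", "bot", "bot"),              # Example 1
    "t": ("or", "int", "t"),                  # Example 2
    "zer": ("obj", "zero", {}),
    "t2": ("or", "zer", "s_t2"),              # Example 3
    "s_t2": ("obj", "succ", {"pred": "t2"}),
}
VALUES = {
    "seven": 7,
    "zero": ("obj", "zero", {}),
    "v_inf": ("obj", "succ", {"pred": "v_inf"}),   # finite but cyclic object
}


def member(types, values, v, t, trail=()):
    """Search for a regular contractive derivation of v in t."""
    goal = (v, t)
    for i, (seen, _) in enumerate(trail):
        if seen == goal:
            # A cycle made only of (VL)/(VR) steps is not contractive.
            return any(rule == "obj" for _, rule in trail[i:])

    def step(rule, a, b):
        return member(types, values, a, b, trail + ((goal, rule),))

    val, typ = values[v], types[t]
    if typ[0] == "or":                              # rules (VL), (VR)
        return step("VL", v, typ[1]) or step("VR", v, typ[2])
    if typ[0] == "int":                             # rule (int)
        return isinstance(val, int)
    # rule (obj): the value may have more fields than the type
    return (not isinstance(val, int) and val[1] == typ[1]
            and all(f in val[2] and step("obj", val[2][f], ft)
                    for f, ft in typ[2].items()))


print(member(TYPES, VALUES, "seven", "bot"))   # False (Example 1)
print(member(TYPES, VALUES, "seven", "t"))     # True  (Example 2)
print(member(TYPES, VALUES, "v_inf", "t2"))    # True  (Example 3)
```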

4 Soundness 

We now prove that the definition of < is sound w.r.t. containment between type interpretations. The 
proof of soundness is based on the following lemma. 

Lemma 4.1 If $t$ is an object type s.t. $t \leq u$ and $v \in t$, then there exists an object type $t'$
(not necessarily equal to $t$) s.t. $v \in t'$, and s.t. there exists a derivation for $t' \leq u$ whose
first applied rule is (∨R1), (∨R2) or (obj).
Proof: The proposed proof is constructive, since it shows that the derivation for $t' \leq u$ is just a
sub-derivation of the derivation for $t \leq u$, and that the derivation for $v \in t'$ can be easily
built from the derivation for $v \in t$.

Let $t = obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])$; by membership rule (obj),
$v = obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n,\ldots])$; furthermore, the corresponding derivation
has the following shape:

\[
(obj)\ \dfrac{
\begin{array}{c} v_1 \in t'_1 \\ \vdots\ k_1 \\ v_1 \in t_1 \end{array}
\quad \ldots \quad
\begin{array}{c} v_n \in t'_n \\ \vdots\ k_n \\ v_n \in t_n \end{array}
}{v \in obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])}
\]

where $t'_1,\ldots,t'_n$ are not union types, and are obtained after repeatedly applying rules (∨L) or
(∨R) $k_1,\ldots,k_n$ times, respectively. We know that all $k_i$ are finite, otherwise the derivation
would not be contractive. The proof proceeds by induction on $m = \sum_{i \in 1..n} k_i$.

If $m = 0$, then all $t_1,\ldots,t_n$ are not union types. If $u = int$, then there are no applicable
subtyping rules and the claim trivially holds since the hypothesis is not satisfied; if $u$ is either a
union or an object type, then the only applicable subtyping rules are (∨R1), (∨R2) or (obj), therefore
we easily conclude with $t' = t$. If $m > 0$ and the derivation is obtained by applying rule^7 (distr),
then $t_1 = t_a \vee t_b$, that is, $t = obj(c,[f_1{:}t_a \vee t_b,\ldots,f_n{:}t_n])$. Furthermore, in
the derivation for $v \in t$, the first applied rule of the sub-derivation for $v_1 \in t_a \vee t_b$ is
either (∨L) or (∨R). If (∨L) has been applied (the other case is completely symmetric), then a
derivation for $v \in obj(c,[f_1{:}t_a,\ldots,f_n{:}t_n])$ can be obtained from that of $v \in t$, by
simply removing the application of rule (∨L) for $v_1 \in t_a \vee t_b$, as depicted in Figure 3.
Therefore in such a derivation $\sum_{i \in 1..n} k_i = m - 1$. Finally, since rule (distr) has been
applied, we know that $obj(c,[f_1{:}t_a,\ldots,f_n{:}t_n]) \leq u$, hence we can conclude by inductive
hypothesis.

As a final remark, note that the construction of $t'$ and of the derivations for $t' \leq u$ and
$v \in t'$ are uniquely determined by the derivations for $t \leq u$ and $v \in t$. Therefore, the proof
of the lemma shows that there exists a function $\mathcal{F}_L$ s.t. if $d_1$ and $d_2$ are derivations
for $t \leq u$ and $v \in t$, respectively, with $t$ object type, then $\mathcal{F}_L(d_1,d_2)$ returns
$(d_3,d_4)$ s.t. $d_3$ and $d_4$ are derivations for $t' \leq u$ and $v \in t'$, respectively, where $t'$
is an object type, $d_3$ is a sub-derivation of $d_1$ where the first applied rule is (∨R1), (∨R2) or
(obj), and $d_4$ is obtained from $d_2$ by replacing some node and removing some applications of rules
(∨L) and (∨R). □

Theorem 4.1 (Soundness) For all $t_1,t_2$, if $t_1 \leq t_2$, then $[\![t_1]\!] \subseteq [\![t_2]\!]$.



Proof: The claim can be put in the following equivalent form: for all $t_1,t_2,v$, if $t_1 \leq t_2$ and
$v \in t_1$, then $v \in t_2$.

The proof is constructive, since it coinductively defines a function $\mathcal{F}$ from derivations for
$t_1 \leq t_2$ and $v \in t_1$ to derivations for $v \in t_2$. The definition of $\mathcal{F}$ is given
by cases on the first applied subtyping rule of the derivation for $t_1 \leq t_2$.

Rule (int)
\[
\mathcal{F}\left((int)\ \dfrac{}{int \leq int},\ (int)\ \dfrac{}{i \in int}\right) = (int)\ \dfrac{}{i \in int}
\]



^7 If one of (∨R1), (∨R2), and (obj) has been applied, then the conclusion is straightforward as for $m = 0$.
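\[
(obj)\ \dfrac{(\vee L)\ \dfrac{\dfrac{\vdots}{v_1 \in t_a}}{v_1 \in t_a \vee t_b}
\quad \ldots \quad
\begin{array}{c} \vdots\ k_n \\ v_n \in t_n \end{array}}
{v \in obj(c,[f_1{:}t_a \vee t_b,\ldots,f_n{:}t_n])}
\qquad \Longrightarrow \qquad
(obj)\ \dfrac{\dfrac{\vdots}{v_1 \in t_a}
\quad \ldots \quad
\begin{array}{c} \vdots\ k_n \\ v_n \in t_n \end{array}}
{v \in obj(c,[f_1{:}t_a,\ldots,f_n{:}t_n])}
\]
Figure 3: Transformation of derivations in proof of Lemma 4.1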



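Rule (∨R1)
\[
\mathcal{F}\left((\vee R1)\ \dfrac{d_1}{t_1 \leq u_1 \vee u_2},\ d_2\right)
= (\vee L)\ \dfrac{\mathcal{F}(d_1,d_2)}{v \in u_1 \vee u_2}
\]
where $d_1$ is a derivation for $t_1 \leq u_1$, and $d_2$ is a derivation for $v \in t_1$.

Rule (∨R2)
\[
\mathcal{F}\left((\vee R2)\ \dfrac{d_1}{t_1 \leq u_1 \vee u_2},\ d_2\right)
= (\vee R)\ \dfrac{\mathcal{F}(d_1,d_2)}{v \in u_1 \vee u_2}
\]
where $d_1$ is a derivation for $t_1 \leq u_2$, and $d_2$ is a derivation for $v \in t_1$.

Rule (∨L) There are two sub-cases, depending on the shape of the derivation for $v \in u_1 \vee u_2$:
\[
\mathcal{F}\left((\vee L)\ \dfrac{d_1 \quad d_2}{u_1 \vee u_2 \leq t_2},\
(\vee L)\ \dfrac{d_3}{v \in u_1 \vee u_2}\right) = \mathcal{F}(d_1,d_3)
\]
\[
\mathcal{F}\left((\vee L)\ \dfrac{d_1 \quad d_2}{u_1 \vee u_2 \leq t_2},\
(\vee R)\ \dfrac{d_4}{v \in u_1 \vee u_2}\right) = \mathcal{F}(d_2,d_4)
\]
In this case $d_1$ and $d_2$ are derivations for $u_1 \leq t_2$ and $u_2 \leq t_2$, respectively, whereas
$d_3$ and $d_4$ are derivations for $v \in u_1$ and $v \in u_2$, respectively.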

Rule (obj)
\[
\mathcal{F}\left(
(obj)\ \dfrac{d_1,\ldots,d_n}
{obj(c,[f_1{:}u_1,\ldots,f_n{:}u_n,\ldots]) \leq obj(c,[f_1{:}u'_1,\ldots,f_n{:}u'_n])},\
(obj)\ \dfrac{d'_1,\ldots,d'_n,\ldots}
{obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n,\ldots]) \in obj(c,[f_1{:}u_1,\ldots,f_n{:}u_n,\ldots])}
\right)
\]
\[
= (obj)\ \dfrac{\mathcal{F}(d_1,d'_1),\ldots,\mathcal{F}(d_n,d'_n)}
{obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n,\ldots]) \in obj(c,[f_1{:}u'_1,\ldots,f_n{:}u'_n])}
\]
where $d_1,\ldots,d_n$ are derivations for $u_1 \leq u'_1,\ldots,u_n \leq u'_n$, respectively, whereas
$d'_1,\ldots,d'_n$ are derivations for $v_1 \in u_1,\ldots,v_n \in u_n$, respectively.

The derivation for
$obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n,\ldots]) \in obj(c,[f_1{:}u_1,\ldots,f_n{:}u_n,\ldots])$
contains ellipses on the right-hand side of the sub-derivations $d'_1,\ldots,d'_n$ and of the fields of
both the value and the type. Their meaning is that there may be other entities in the derivation
which, however, can be omitted, since the definition of $\mathcal{F}$ does not depend on them.
Rule (distr) In this case the hypotheses of Lemma 4.1 are verified, therefore we can use the function
$\mathcal{F}_L$ defined in the proof of the lemma:
\[
\mathcal{F}(d_1,d_2) = \mathcal{F}(\mathcal{F}_L(d_1,d_2))
\]
where $d_1$ is a derivation for $t_1 \leq t_2$ whose first applied rule is (distr), hence $t_1$ is an
object type, and $d_2$ is a derivation for $v \in t_1$. According to the proof of the lemma,
$\mathcal{F}_L(d_1,d_2)$ returns $(d_3,d_4)$ s.t. $d_3$ and $d_4$ are derivations for $t \leq t_2$ and
$v \in t$, $t$ is an object type, and the first applied rule of $d_3$ is (∨R1), (∨R2), or (obj).
Therefore case (distr) is delegated to one of the three cases (∨R1), (∨R2), (obj) specified above.
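
The case analysis above can be sketched as a recursive function on finite derivations. This is only an illustration under explicit assumptions: the proof defines the function coinductively on possibly infinite derivations, whereas the sketch handles finite trees; the class D, its field names and the tuple encoding of judgments are hypothetical; and case (distr), which needs the function of Lemma 4.1, is not implemented.

```python
from dataclasses import dataclass, field


@dataclass
class D:
    """A finite derivation node: rule label, conclusion, sub-derivations."""
    rule: str
    concl: tuple                     # (t1, t2) for subtyping, (v, t) for membership
    kids: dict = field(default_factory=dict)


def F(d1, d2):
    """Map derivations of t1 <= t2 and v in t1 to a derivation of v in t2."""
    _, t2 = d1.concl
    v, _ = d2.concl
    if d1.rule == "int":
        return d2
    if d1.rule == "VR1":             # result ends with membership rule (VL)
        return D("VL", (v, t2), {0: F(d1.kids[0], d2)})
    if d1.rule == "VR2":             # result ends with membership rule (VR)
        return D("VR", (v, t2), {0: F(d1.kids[0], d2)})
    if d1.rule == "VL":
        # d2 ends with (VL) or (VR): follow the matching premise of d1.
        branch = 0 if d2.rule == "VL" else 1
        return F(d1.kids[branch], d2.kids[0])
    if d1.rule == "obj":
        # Only the fields required by the supertype are kept.
        return D("obj", (v, t2),
                 {f: F(d, d2.kids[f]) for f, d in d1.kids.items()})
    raise NotImplementedError("case (distr) relies on Lemma 4.1")


# int ∨ int <= int by (VL); 3 in int ∨ int by (VR).
union = ("or", "int", "int")
d_int = D("int", ("int", "int"))
m_int = D("int", (3, "int"))
d1 = D("VL", (union, "int"), {0: d_int, 1: d_int})
d2 = D("VR", (3, union), {0: m_int})
print(F(d1, d2).concl)  # (3, 'int')
```

In the (∨L) case the function does not produce a new node but recurs on smaller arguments, which is why the well-definedness argument has to treat that case (and (distr)) separately.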

Now the remaining part of the proof is showing that $\mathcal{F}$ is well-defined. Since $\mathcal{F}$ is
defined coinductively, we need to prove that $\mathcal{F}$ is a function, that is, it cannot return two
different derivations when applied to the same arguments. To show this, we first prove the following
property.



Property (*) If $d_1$ and $d_2$ are derivations for $t_1 \leq t_2$ and $v \in t_1$, respectively, and
$(d_1,d_2)$ matches cases (∨L) or (distr) of the definition of $\mathcal{F}$, then there always exist
$d_3$ and $d_4$ s.t. for any derivation $d$ returned by $\mathcal{F}(d_1,d_2)$, the following facts hold:
$d = \mathcal{F}(d_3,d_4)$, there exists $t$ s.t. $d_3$ and $d_4$ are derivations for $t \leq t_2$ and
$v \in t$, respectively, and $(d_3,d_4)$ matches one of the (int), (∨R1), (∨R2), and (obj) cases.



Proof of (*): It is immediate to prove that if $d_1$ and $d_2$ are derivations for $t_1 \leq t_2$ and
$v \in t_1$, respectively, then there always exists one and only one case matching $(d_1,d_2)$ in the
definition of $\mathcal{F}$. If $(d_1,d_2)$ matches case (distr), then by Lemma 4.1 we know that
$\mathcal{F}_L$ is defined on $(d_1,d_2)$, and returns $(d_3,d_4)$ s.t. $d_3$ and $d_4$ are derivations
for $t \leq t_2$ and $v \in t$, where $t$ is an object type, and the first applied rule of $d_3$ is
(∨R1), (∨R2) or (obj). Now, since $(d_1,d_2)$ cannot match any other case, by definition of
$\mathcal{F}$ we can conclude that for any $d$ returned by $\mathcal{F}(d_1,d_2)$, the equality
$d = \mathcal{F}(\mathcal{F}_L(d_1,d_2)) = \mathcal{F}(d_3,d_4)$ must hold.

If $(d_1,d_2)$ matches case (∨L), then we proceed by induction on the number $n$ of contiguous
applications of membership rules (∨L) and (∨R) with which derivation $d_2$ starts. We know that such $n$
is finite, otherwise $d_2$ would not be contractive. The basis is for $n = 1$, since for $n = 0$ the pair
$(d_1,d_2)$ would not match case (∨L); for simplicity, let us assume that $d_2$ starts with the
application of rule (∨L), that is, the first sub-case applies (the other sub-case is symmetric). Then we
know that $d_1$ and $d_2$ have the following shape:

\[
d_1 = (\vee L)\ \dfrac{d_3 \quad d'_3}{t \vee t' \leq t_2}
\qquad
d_2 = (\vee L)\ \dfrac{d_4}{v \in t \vee t'}
\]

where $d_3$ and $d_4$ are derivations for $t \leq t_2$ and $v \in t$, respectively. Since $(d_1,d_2)$
cannot match any other case, by definition of $\mathcal{F}$ we have that for any $d$ returned by
$\mathcal{F}(d_1,d_2)$, the equality $d = \mathcal{F}(d_3,d_4)$ must hold. Finally, $(d_3,d_4)$ must match
some case of the definition of $\mathcal{F}$, but such a case cannot be (∨L); indeed, $n = 1$ and,
therefore, $t$ cannot be a union type. In case $(d_3,d_4)$ matches case (distr), we can apply^8 the
result already proved for that case. The inductive step is a direct consequence of the inductive
hypothesis and of the fact that if $d_2$ starts with $n + 1$ consecutive applications of rules (∨L) and
(∨R), then $d_4$ starts with $n$ consecutive applications of rules (∨L) and (∨R).
We can now prove the following property. 



$\mathcal{F}$ is deterministic: For all $d_1, d_2, d, d'$, if $\mathcal{F}(d_1,d_2) = d$ and
$\mathcal{F}(d_1,d_2) = d'$, then $d = d'$.

We prove that $d = d'$ by induction on the height of the finite trees approximating $d$ and $d'$, that
is, we show that all paths of $d$ starting from its root are equal to the paths of $d'$ starting from
its root, for all the lengths of the paths. The basis consists in proving that $d$ and $d'$ have the
same root and start with the same rule application (that is, the path length is 0). This comes
directly from the definition of $\mathcal{F}$ for the cases (int), (∨R1), (∨R2), and (obj), from the
fact that all cases are disjoint, and from property (*) (which deals with the two remaining cases).
The inductive step is derived from these same facts, from the inductive hypothesis, and from the
standard definition of path length.

Footnote (on the proof of property (*)): In case (∨L) the result already proved for case (distr) can
be applied because the proof of case (distr) does not depend on the proof of case (∨L).

$\mathcal{F}$ returns contractive derivations: If $d_1$ and $d_2$ are derivations for $t_1 \le t_2$ and
$v \in t_1$, respectively, then $\mathcal{F}(d_1,d_2)$ is defined and is a derivation for $v \in t_2$.

First, we recall that the definition of $\mathcal{F}$ covers all possible cases, hence $\mathcal{F}$ is
always defined on $(d_1,d_2)$. Then we show that the tree returned by $\mathcal{F}$ is always a
derivation, and finally we prove that all returned derivations are contractive. To prove that all
returned trees are derivations, we first observe that $\mathcal{F}$ always returns a tree whose root
is the judgment $v \in t_2$. Again, this comes directly from the definition of $\mathcal{F}$ for the
cases (int), (∨R1), (∨R2), and from property (*) (which deals with the two remaining cases). Then the
proof proceeds by induction on the height of the finite derivations approximating
$\mathcal{F}(d_1,d_2)$. That is, we prove that every node whose distance from the root is less than or
equal to $n$ is obtained with a correct rule instantiation, for all $n$. The basis (for $n = 0$) comes
directly from the definition of $\mathcal{F}$ for the cases (int), (∨R1), (∨R2), and from property (*).
Let us see case (∨R1) as an example. In this case we know that

$$
\mathcal{F}(d_1,d_2) = \text{($\lor$L)}\ \frac{\mathcal{F}(d_3,d_4)}{v \in u_1 \lor u_2},
$$

where $d_3$ is a derivation for $t_1 \le u_1$, and $d_4$ is a derivation for $v \in t_1$, therefore the
root of $\mathcal{F}(d_3,d_4)$ is $v \in u_1$, hence $v \in u_1 \lor u_2$ is obtained with a correct
instantiation of rule (∨L). The inductive step is derived from the definition of $\mathcal{F}$ for the
cases (int), (∨R1), (∨R2), from property (*), from the inductive hypothesis, and from the standard
definition of path length.

We conclude the proof by showing that if $d_1$ and $d_2$ are contractive, then $\mathcal{F}(d_1,d_2)$
is contractive as well. By contradiction, let us assume that the returned derivation is not
contractive, that is, there exists a sub-derivation containing just applications of membership rules
(∨L) and (∨R). Since (∨R1) and (∨R2) are the only two cases where an application of membership rule
(∨L) or (∨R) is added to the returned derivation, and cases (∨L) and (distr) may be defined in terms
of cases (∨R1) and (∨R2), such a sub-derivation can be built by applying only cases (∨R1), (∨R2),
(∨L) and (distr) of the definition of $\mathcal{F}$. Now we observe that if case (distr) occurs, then,
by definition of the function given in Lemma 4.1 and by definition of cases (∨R1) and (∨R2), only
cases (∨R1) and (∨R2) may occur afterwards; but this means that $d_1$ contains a sub-derivation built
only with rules (∨R1) and (∨R2), that is, $d_1$ is not contractive, which is in contradiction with the
hypothesis. If case (distr) does not occur, and case (∨L) occurs infinitely many times, then by
definition of cases (∨R1), (∨R2), and (∨L), we deduce that $d_2$ is not contractive, against the
hypothesis. The last possibility is when case (distr) does not occur, and case (∨L) occurs only a
finite number of times; but this necessarily means that at a certain point only cases (∨R1) and (∨R2)
may occur, that is, $d_1$ is not contractive, which is in contradiction with the hypothesis. □

Footnote (on path lengths): Recall that the path from the root to a given node is always finite, even
when the tree is infinite.

Footnote (on the distance from the root): The distance of a node from the root is the length of the
path from the node to the root.

5 A complete characterization of the empty type

We have already shown in Section 3 that $obj(c,[f_1{:}\bot, f_2{:}t]) \le \bot$, where $\bot$ is the
empty type, that is, the type s.t. $\bot = \bot \lor \bot$; therefore, $\bot$ and
$obj(c,[f_1{:}\bot, f_2{:}t])$ are equivalent. In fact, besides $obj(c,[f_1{:}\bot, \ldots])$, there are
infinitely many other types equivalent to $\bot$, namely, all object types "containing" $\bot$.

For instance, the type $t = obj(c_1,[f{:}obj(c_2,[g{:}\bot])])$ is s.t. $[\![t]\!] = \emptyset$.
Unfortunately, $t \le \bot$ is not derivable from the rules in Figure 1. Indeed, all possible
derivations can be built by only applying rules (∨R1) and (∨R2), and are, therefore, not contractive.
To overcome this problem, we introduce a rule explicitly dealing with all types equivalent to the
empty type. In order to do that, we would need a predicate characterizing all types $t$ equivalent to
$\bot$. However, the complementary predicate $t \uparrow \bot$ turns out to be more convenient, because
of its strong similarity with the membership relation; indeed, a type $t$ is not equivalent to the
empty type iff there exists a value $v$ s.t. $v \in t$ holds. In this way, it is quite straightforward
to prove that the predicate $t \uparrow \bot$ is sound and complete w.r.t. our type interpretation.
Hence, our new subtyping rule is defined as follows.

$$
\text{(empty)}\ \frac{}{t_1 \le t_2} \qquad t_1 \not\uparrow \bot
$$

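The definition of $t \uparrow \bot$ is quite straightforward.

$$
\text{($\uparrow\lor$L)}\ \frac{t_1 \uparrow \bot}{t_1 \lor t_2 \uparrow \bot}
\qquad
\text{($\uparrow\lor$R)}\ \frac{t_2 \uparrow \bot}{t_1 \lor t_2 \uparrow \bot}
\qquad
\text{($\uparrow$int)}\ \frac{}{\mathit{int} \uparrow \bot}
\qquad
\text{($\uparrow$obj)}\ \frac{t_1 \uparrow \bot \quad \cdots \quad t_n \uparrow \bot}{obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n]) \uparrow \bot}
$$

As usual, all derivations have to be contractive, hence they cannot contain sub-derivations obtained
by only applying rules (↑∨L) and (↑∨R).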

Note that if we restrict ourselves to regular types, then the definition of $t \uparrow \bot$ can be
turned into the following algorithm specified in pseudo-Java code.



    boolean not_empty(type t, stack path) {
        if (t.is_visited())
            return path.is_contractive(t);
        else {
            t.set_visited();
            switch (t) {
                case int: return true;
                case t1 ∨ t2:
                    path.push(t);
                    if (not_empty(t1, path)) {
                        path.pop();
                        return true;
                    }
                    res = not_empty(t2, path);
                    path.pop();
                    return res;
                case obj(c, [f1:t1, ..., fn:tn]):
                    path.push(t);
                    for i ∈ 1, ..., n {
                        if (!not_empty(ti, path)) {
                            path.pop();
                            return false;
                        }
                    }
                    path.pop();
                    return true;
            }
        }
    }



The argument t is the type to be inspected, whereas path contains the stack of visited nodes, which must
be initially empty. Such a stack is used for checking that the found derivation is contractive. Methods
is_visited and set_visited are used to keep track of visited terms, which correspond to nodes in a
graph. If we end up with an already visited type, then we have an infinite regular path that, however,
has to be contractive, otherwise the corresponding derivation is not valid: method is_contractive
checks whether there is an object type in the sub-path of path from t to the top of the stack. The time
complexity of the algorithm is linear in the number of edges of the graph representing the term,
provided that is_contractive has constant time complexity.
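
The non-emptiness check on regular types can also be tried out with a small runnable sketch. The one below is not the linear-time depth-first algorithm described above: it swaps in a naive nested fixed-point iteration (an outer greatest fixed point for object types, an inner least fixed point for chains of unions), which is quadratic in the worst case but is meant to compute the same predicate. The dictionary encoding of type graphs and the names `GRAPH` and `non_empty_nodes` are assumptions made only for this illustration.

```python
# Hypothetical encoding of a regular type as a finite graph (names are assumptions):
#   ("int",)                  the integer type
#   ("or", left, right)       a union type; children are node names
#   ("obj", cls, {f: node})   an object type; field types are node names

def non_empty_nodes(graph):
    """Return the names of the nodes whose type is not equivalent to the empty type."""
    candidates = set(graph)      # outer greatest fixed point: start from all nodes
    while True:
        proved = set()           # inner least fixed point: start from no node
        changed = True
        while changed:
            changed = False
            for name, node in graph.items():
                if name in proved:
                    continue
                kind = node[0]
                if kind == "int":
                    ok = True
                elif kind == "or":
                    # A union needs a child proved in this round, so a cycle made
                    # only of unions (a non-contractive derivation) is rejected.
                    ok = node[1] in proved or node[2] in proved
                else:  # "obj"
                    # An object type may rely coinductively on the current candidates.
                    ok = all(child in candidates for child in node[2].values())
                if ok:
                    proved.add(name)
                    changed = True
        if proved == candidates:
            return proved
        candidates = proved      # shrink the candidate set and try again


GRAPH = {
    "bot": ("or", "bot", "bot"),              # the empty type: bot = bot ∨ bot
    "t": ("obj", "c1", {"f": "u"}),           # obj(c1, [f: obj(c2, [g: bot])])
    "u": ("obj", "c2", {"g": "bot"}),
    "nat": ("or", "zero", "succ"),            # a recursive union of two object types
    "zero": ("obj", "zero", {}),
    "succ": ("obj", "succ", {"pred": "nat"}),
    "loop": ("obj", "c", {"f": "loop"}),      # contractive cycle through an object type
    "int": ("int",),
    "int_or_bot": ("or", "int", "bot"),
}

result = non_empty_nodes(GRAPH)
print(sorted(result))
```

On this graph the nodes `bot`, `t` and `u` are classified as empty, while `loop` is accepted because its only cycle goes through an object type.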

We can now prove that the definition of $t \uparrow \bot$ is sound and complete w.r.t. the
interpretation of types.

Theorem 5.1 (Soundness of $t \uparrow \bot$) If $t \uparrow \bot$, then $[\![t]\!] \neq \emptyset$.



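Proof: Similarly to the proof of Theorem 4.1, we coinductively define a function $\mathcal{M}$ mapping
derivations for $t \uparrow \bot$ to derivations for $v \in t$, for a fixed value $v$:

$$
\mathcal{M}\left(\text{($\uparrow$int)}\ \frac{}{\mathit{int} \uparrow \bot}\right)
  = \text{(int)}\ \frac{}{i \in \mathit{int}}
\qquad
\mathcal{M}\left(\text{($\uparrow\lor$L)}\ \frac{d}{t_1 \lor t_2 \uparrow \bot}\right)
  = \text{($\lor$L)}\ \frac{\mathcal{M}(d)}{v \in t_1 \lor t_2}
$$

$$
\mathcal{M}\left(\text{($\uparrow\lor$R)}\ \frac{d}{t_1 \lor t_2 \uparrow \bot}\right)
  = \text{($\lor$R)}\ \frac{\mathcal{M}(d)}{v \in t_1 \lor t_2}
$$

$$
\mathcal{M}\left(\text{($\uparrow$obj)}\ \frac{d_1 \quad \cdots \quad d_n}{obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n]) \uparrow \bot}\right)
  = \text{(obj)}\ \frac{\mathcal{M}(d_1) \quad \cdots \quad \mathcal{M}(d_n)}{obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n]) \in obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])}
$$

Note that $\mathcal{M}$ fully preserves the shape of derivations, in the sense that only the derived
judgments change. Using a similar, but simpler, proof scheme as adopted for Theorem 4.1, it is possible
to prove that the above definition corresponds to a function s.t. for all derivations $d$ for
$t \uparrow \bot$, $\mathcal{M}(d)$ is a derivation for $v \in t$, for a certain $v$. □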
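
As a small worked illustration, which is not part of the original proof, consider the type $\mathit{int} \lor \bot$ and write $t \uparrow \bot$ for the non-emptiness predicate. A two-step derivation of non-emptiness (left) is translated rule by rule into a membership derivation with the same shape (right), where $i$ stands for an arbitrary integer value:

```latex
\[
\text{($\uparrow\lor$L)}\
\dfrac{\text{($\uparrow$int)}\ \dfrac{}{\mathit{int} \uparrow \bot}}
      {\mathit{int} \lor \bot \uparrow \bot}
\quad\longmapsto\quad
\text{($\lor$L)}\
\dfrac{\text{(int)}\ \dfrac{}{i \in \mathit{int}}}
      {i \in \mathit{int} \lor \bot}
\]
```

Only the judgments change; the tree structure, and hence contractiveness, is preserved.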



Theorem 5.2 (Completeness of $t \uparrow \bot$) If $[\![t]\!] \neq \emptyset$, then $t \uparrow \bot$.



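Proof: The proof is similar to that for soundness, except that here the function definition is even
simpler, since it basically forgets the value $v$ in the membership judgment.

$$
\mathcal{N}\left(\text{(int)}\ \frac{}{v \in \mathit{int}}\right)
  = \text{($\uparrow$int)}\ \frac{}{\mathit{int} \uparrow \bot}
\qquad
\mathcal{N}\left(\text{($\lor$L)}\ \frac{d}{v \in t_1 \lor t_2}\right)
  = \text{($\uparrow\lor$L)}\ \frac{\mathcal{N}(d)}{t_1 \lor t_2 \uparrow \bot}
$$

$$
\mathcal{N}\left(\text{($\lor$R)}\ \frac{d}{v \in t_1 \lor t_2}\right)
  = \text{($\uparrow\lor$R)}\ \frac{\mathcal{N}(d)}{t_1 \lor t_2 \uparrow \bot}
$$

$$
\mathcal{N}\left(\text{(obj)}\ \frac{d_1 \quad \cdots \quad d_n}{obj(c,[f_1 \mapsto v_1,\ldots,f_n \mapsto v_n]) \in obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n])}\right)
  = \text{($\uparrow$obj)}\ \frac{\mathcal{N}(d_1) \quad \cdots \quad \mathcal{N}(d_n)}{obj(c,[f_1{:}t_1,\ldots,f_n{:}t_n]) \uparrow \bot}
$$

□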

This final result allows us to fully reuse the proof of Theorem 4.1 to show that subtyping remains
sound w.r.t. containment between type interpretations, if rule (empty) is added.

Corollary 5.1 The subtyping relation coinductively defined by the rules in Figure 1 and by rule
(empty) is sound w.r.t. containment between type interpretations.

Proof: It suffices to consider the same function $\mathcal{F}$ defined in the proof of Theorem 4.1,
since the new case (empty) cannot occur; indeed, there exist no derivations $d_1$ and $d_2$ for
$t_1 \le t_2$ and $v \in t_1$, respectively, s.t. the first applied rule of $d_1$ is (empty), because,
by the side condition of rule (empty), $t_1 \not\uparrow \bot$, and, hence, by Theorem 5.2,
$[\![t_1]\!] = \emptyset$. □

Footnote (on the constant time complexity of is_contractive): This can be achieved by associating a
position with each node in the path, and by recording the minimum position p s.t. all paths starting
from a node whose position is greater than p are non contractive.



6 Conclusion 

We have studied a subtyping relation on coinductive terms built on object and union type constructors,
by providing a quite natural interpretation based on a membership relation of values to types, and proved
that such a relation is sound w.r.t. containment between type interpretations.

This study has allowed us to improve the original definition of subtyping IZj in two different direc- 
tions: 

• Contractiveness was too restrictive, since no derivations built only with (∨R1), (∨R2), and (distr)
rules were allowed, whereas the type interpretation and the corresponding proof of soundness given
here have shown that no restriction on rule (distr) is ever needed. Consequently, the subtyping
relation can be implemented more directly, since rules (∨R1) and (∨R2) have only one premise,
in contrast with (distr), and, therefore, checking contractiveness of derivations is simpler.

• The definition did not consider all possible representations of the empty type. Consequently a cor- 
responding new rule has been added, and a sound and complete characterization of all representa- 
tions of the empty type has been provided; when restricted to regular types, such a characterization 
directly provides an algorithm for checking whether the interpretation of a type is empty. The time 
complexity of the algorithm is linear in the number of edges of the graph representing the term.

A Appendix: Horn clauses generated by the code examples in Section 2

The last clauses of has_field and has_meth are essential for correctly dealing with inherited fields and
methods, respectively, even though they could be safely omitted here, since classes Zero and Succ do not
inherit any field or method. Note that we have used negation just for brevity, but it can always be
omitted by defining the trivial predicates not_dec_field and not_dec_meth, since dec_field and dec_meth
are simply defined by a collection of ground facts.

Finally, note that the definition of predicate field_acc (for field access) depends on the predicate
rec_acc (for record access) which is defined by a single clause containing just a singleton record;
this is correct thanks to subsumption and subtyping on record types. For instance, since the goal
rec_acc([f1:int], f1, int) is derivable, and [f1:int, f2:obj(c,[])] is a subtype of [f1:int], then
rec_acc([f1:int, f2:obj(c,[])], f1, int) is derivable as well, by subsumption.

class(object).
class(zero).
class(succ).
extends(zero, object).
extends(succ, object).

subclass(X, X) ← class(X).
subclass(X, object) ← class(X).
subclass(X, Y) ← extends(X, Z), subclass(Z, Y).

field_acc(obj(C, R), F, T) ← has_field(C, F), rec_acc(R, F, T).
field_acc(T1∨T2, F, FT1∨FT2) ← field_acc(T1, F, FT1), field_acc(T2, F, FT2).
rec_acc([F:T], F, T).

invoke(obj(C, R), M, A, RT) ← has_meth(C, M, [obj(C, R)|A], RT).
invoke(T1∨T2, M, A, RT1∨RT2) ← invoke(T1, M, A, RT1), invoke(T2, M, A, RT2).

new(object, [], obj(object, [])).
new(zero, [], obj(zero, R)) ← extends(zero, P), new(P, [], obj(P, R)).
new(succ, [N], obj(succ, [pred:N|R])) ← extends(succ, P), new(P, [], obj(P, R)).

dec_field(succ, pred).
has_field(C, F) ← dec_field(C, F).
has_field(C, F) ← extends(C, P), has_field(P, F), ¬dec_field(C, F).

dec_meth(zero, add).
dec_meth(succ, add).
has_meth(zero, add, [This, N], N).
has_meth(succ, add, [This, N], R) ← field_acc(This, pred, P), new(succ, [N], S),
                                     invoke(P, add, [S], R).
has_meth(C, M, A, R) ← extends(C, P), has_meth(P, M, A, R), ¬dec_meth(C, M).
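
The inheritance-aware lookup encoded by the two has_field clauses can be mimicked by a short runnable sketch. The facts are copied from the clauses of this appendix, whereas the Python encoding (the names `EXTENDS`, `DEC_FIELD` and the function `has_field`) is an assumption made only for illustration; it evaluates ground queries directly and is not a coinductive resolution engine.

```python
# Ground facts from the appendix; the dictionary/set encoding is an assumption.
EXTENDS = {"zero": "object", "succ": "object"}   # extends(C, P)
DEC_FIELD = {("succ", "pred")}                   # dec_field(C, F)


def has_field(cls, field):
    """Mirror the two has_field clauses for ground arguments."""
    if (cls, field) in DEC_FIELD:
        # First clause: the field is declared in the class itself.
        return True
    parent = EXTENDS.get(cls)
    if parent is None:
        # No extends fact for this class, so the second clause cannot apply.
        return False
    # Second clause: the negated dec_field literal holds here because the
    # first test failed, so the answer is inherited from the superclass.
    return has_field(parent, field)


print(has_field("succ", "pred"), has_field("zero", "pred"))
```

Since dec_field is a finite set of ground facts, the negated literal reduces to a plain membership test, which is why the appendix remarks that negation is used only for brevity.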



